fuzzy synset
An Algorithm for Fuzzification of WordNets, Supported by a Mathematical Proof
Hossayni, Sayyed-Ali | Akbarzadeh-T, Mohammad-R | Recupero, Diego Reforgiato | Gangemi, Aldo | Del Acebo, Esteve | de la Rosa i Esteva, Josep Lluís
WordNet-like Lexical Databases (WLDs) group English words into sets of synonyms called "synsets." Although standard WLDs are used in many successful text-mining applications, they assume that every word-sense represents the meaning of its synset to the same degree, which is not generally true. To overcome this limitation, several fuzzy versions of synsets have been proposed. To the best of our knowledge, these studies share a common trait: they do not aim to produce fuzzified versions of the existing WLDs but build new WLDs from scratch, which has limited the attention they have received from the text-mining community, many of whose resources and applications are based on the existing WLDs. In this study, we present an algorithm for constructing fuzzy versions of WLDs of any language, given a corpus of documents and a word-sense disambiguation (WSD) system for that language. Then, using the Open American National Corpus and the UKB WSD system as inputs, we construct and publish online a fuzzified version of the English WordNet (FWN). We also provide a mathematical proof of the validity of its results.
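To make the idea of "corpus plus WSD system in, fuzzy synsets out" concrete, here is a minimal sketch. It is not the algorithm from the paper, whose details the abstract does not give. It assumes that the WSD system has already labelled each corpus token with a synset identifier, and that a word's membership degree in a synset can be approximated by its tagged frequency divided by the frequency of the most common word in that synset. The function name `fuzzy_memberships`, the synset identifiers, and the sample data are all hypothetical.

```python
from collections import Counter, defaultdict


def fuzzy_memberships(tagged_tokens):
    """Estimate membership degrees from (word, synset_id) pairs.

    Assumption (not taken from the paper): membership is the word's
    count in the synset divided by the highest count in that synset,
    so the most frequent word always gets degree 1.0.
    """
    counts = defaultdict(Counter)
    for word, synset_id in tagged_tokens:
        counts[synset_id][word] += 1

    memberships = {}
    for synset_id, word_counts in counts.items():
        peak = max(word_counts.values())
        memberships[synset_id] = {
            word: count / peak for word, count in word_counts.items()
        }
    return memberships


# Hypothetical output of a WSD system run over a tiny corpus.
sample = (
    [("car", "s1")] * 4
    + [("auto", "s1")] * 2
    + [("machine", "s1")]
    + [("machine", "s2")] * 3
)

result = fuzzy_memberships(sample)
print(result["s1"])
```

In this toy run, "machine" belongs to synset `s1` only weakly but fully to `s2`, which is the kind of graded distinction a crisp synset cannot express.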
Automatic Discovery of Fuzzy Synsets from Dictionary Definitions
Oliveira, Hugo Gonçalo (University of Coimbra) | Gomes, Paulo (University of Coimbra)
In order to deal with ambiguity in natural language, it is common to organise words, according to their senses, in synsets, which are groups of synonymous words that can be seen as concepts. The manual creation of a broad-coverage synset base is a time-consuming task, so we take advantage of dictionary definitions for extracting synonymy pairs and clustering for identifying synsets. Since word senses are not discrete, we create fuzzy synsets, where each word has a membership degree. We report on the results of the creation of a fuzzy synset base for Portuguese, from three electronic dictionaries. The resulting resource is larger than existing handcrafted Portuguese thesauri.
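As a rough illustration of how synonymy pairs and a cluster can yield membership degrees, consider the sketch below. It is not the clustering procedure used by the authors, which the abstract does not specify. It assumes the cluster is already given and scores each word by the fraction of the other cluster members it is directly paired with. The function name `membership_in_cluster` and the English sample words are hypothetical.

```python
from collections import defaultdict


def membership_in_cluster(pairs, cluster):
    """Score each word by how many other cluster members it is paired with.

    Assumption (not taken from the paper): membership is the number of
    synonymy links into the cluster divided by the cluster size minus one.
    """
    neighbours = defaultdict(set)
    for a, b in pairs:
        neighbours[a].add(b)
        neighbours[b].add(a)

    degrees = {}
    for word in cluster:
        others = cluster - {word}
        if not others:
            # A single-word cluster trivially contains its word in full.
            degrees[word] = 1.0
            continue
        degrees[word] = len(neighbours[word] & others) / len(others)
    return degrees


# Hypothetical synonymy pairs extracted from dictionary definitions.
pairs = [
    ("fast", "quick"),
    ("fast", "rapid"),
    ("quick", "rapid"),
    ("rapid", "swift"),
]
degrees = membership_in_cluster(pairs, {"fast", "quick", "rapid", "swift"})
print(degrees)
```

Here "rapid" is linked to every other member and gets degree 1.0, while "swift" is linked to only one of three and gets a low degree, so it sits at the edge of the fuzzy synset.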